Machine Learning Methods with Noisy, Incomplete or Small Datasets
Abstract
In this article, we present a collection of fifteen novel contributions on machine learning methods with low-quality or imperfect datasets, which were accepted for publication in the special issue “Machine Learning Methods with Noisy, Incomplete or Small Datasets”, Applied Sciences (ISSN 2076-3417). These papers provide a variety of approaches to real-world problems where the available datasets suffer from imperfections such as missing values, noise, or artefacts. Contributions in applied sciences include medical applications, epidemic management tools, methodological work, and industrial applications, among others. We believe that this special issue will bring new ideas for solving this challenging problem and will provide clear examples of application scenarios.
Related resources
Machine Learning with Large Datasets
This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of...
Authoritative Citation KNN Learning with Noisy Training Datasets
In this paper, we investigate the effectiveness of Citation K-Nearest Neighbors (KNN) learning with noisy training datasets. We devise an authority measure associated with each training instance that changes based on the outcome of Citation KNN classification. The authority is increased when a citer’s classification was right, and decreased when it was wrong. We show that by modifying only these authority ...
Performance of Various Machine Learning Classifiers on Small Datasets with Varying Dimensionalities: A Study
Classification is an important supervised learning technique that is used by many applications. An important factor on which the performance of a classifier depends is the size of the dataset on which the classifier is trained. In this manuscript, the authors have analyzed five different classification techniques (namely decision trees, KNN, SVM, linear discriminant and Ensemble m...
Effects of information and machine learning algorithms on word sense disambiguation with small datasets
Current approaches to word sense disambiguation use (and often combine) various machine learning techniques. Most refer to characteristics of the ambiguity and its surrounding words and are based on thousands of examples. Unfortunately, developing large training sets is burdensome, and in response to this challenge, we investigate the use of symbolic knowledge for small datasets. A naïve Bayes ...
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets
This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the...
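The two counting tasks named in the abstract above (building a contingency table and counting the records that match a conjunctive query) can be illustrated with a naive baseline. This is only a sketch under stated assumptions: the toy records, attribute names, and the helper functions `contingency_table` and `count_matching` are hypothetical, and the code deliberately does not attempt to reproduce the cached data structures the paper introduces.

```python
from collections import Counter

# Hypothetical toy dataset: each record maps an attribute name to a symbolic value.
RECORDS = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]


def contingency_table(records, attributes):
    """Count how often each combination of values of `attributes` occurs."""
    return Counter(tuple(r[a] for a in attributes) for r in records)


def count_matching(records, query):
    """Count records that satisfy every attribute == value pair in `query`."""
    return sum(all(r.get(a) == v for a, v in query.items()) for r in records)


# Joint counts over two attributes, keyed by (outlook, play) value pairs.
table = contingency_table(RECORDS, ["outlook", "play"])

# Conjunctive query: outlook == "sunny" AND windy == "no".
n_sunny_calm = count_matching(RECORDS, {"outlook": "sunny", "windy": "no"})
```

Each call in this baseline scans every record once, so it is meant only to make the two counting tasks concrete, not to show how they can be made fast.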
Journal
Journal title: Applied Sciences
Year: 2021
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app11094132